A new audio coding scheme using a forward masking model and perceptually weighted vector quantization

Authors

  • Yuan-Hao Huang
  • Tzi-Dar Chiueh
Abstract

This paper presents a new audio coder that uses two techniques to improve the sound quality of the audio coding system. First, a forward masking model is proposed. This model exploits the adaptation of the peripheral sensory and neural elements in the auditory system, which is often regarded as the cause of forward masking. In the proposed audio coder, forward masking is first modeled by a nonlinear analog circuit, and difference equations are then formulated to solve this circuit. The parameters of the circuit are derived from several factors, including the time difference between masker and maskee, masker level, masker frequency, and masker duration. Including this model in the coding process removes more of the redundancy that is inaudible to humans and thus improves coding efficiency. Second, we propose a new vector quantization technique whose codebooks are generated by a perceptually weighted binary-tree self-organizing feature map (PW-BTSOFM) algorithm. This technique adopts a perceptually weighted error criterion to train and select codewords so that the quantization error is kept below the just-noticeable distortion (JND) while using the smallest possible codebook, again reducing the required bit rate. Objective and subjective sound quality measurements show that the proposed audio coding scheme requires about 30% fewer bits than the MPEG Layer III audio coding standard.
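The idea of keeping the quantization error below the JND while using the smallest possible codebook can be illustrated with a minimal Python sketch. It is not the PW-BTSOFM algorithm from the paper: the function names, the flat list of codebooks ordered from smallest to largest, and the specific criterion (the largest per-coefficient ratio of squared error to allowed noise energy must not exceed 1) are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence, Tuple

Vector = Sequence[float]


def weighted_error(x: Vector, codeword: Vector, jnd: Vector) -> float:
    """Largest per-coefficient ratio of squared error to allowed noise energy.

    Assumption: `jnd` holds the allowed noise energy for each coefficient,
    so a ratio of at most 1.0 is treated as inaudible distortion.
    """
    return max((xi - ci) ** 2 / ji for xi, ci, ji in zip(x, codeword, jnd))


def best_codeword(x: Vector, codebook: List[Vector], jnd: Vector) -> Tuple[int, float]:
    """Return the index and weighted error of the closest codeword."""
    errors = [weighted_error(x, c, jnd) for c in codebook]
    index = min(range(len(codebook)), key=errors.__getitem__)
    return index, errors[index]


def select_codebook(
    x: Vector, codebooks: List[List[Vector]], jnd: Vector
) -> Optional[Tuple[int, int]]:
    """Try codebooks from smallest to largest.

    Returns (codebook level, codeword index) for the first codebook whose
    best codeword keeps the weighted error at or below 1.0, or None if
    no codebook is fine enough.
    """
    for level, codebook in enumerate(codebooks):
        index, err = best_codeword(x, codebook, jnd)
        if err <= 1.0:
            return level, index
    return None


# Hypothetical codebooks with 1, 2, and 4 codewords (not trained data).
codebooks = [
    [[0.0, 0.0]],
    [[1.0, 1.0], [-1.0, -1.0]],
    [[1.0, 0.5], [1.0, 1.5], [-1.0, -0.5], [-1.0, -1.5]],
]
x = [1.0, 0.6]

tight = select_codebook(x, codebooks, [0.04, 0.04])  # little masking
loose = select_codebook(x, codebooks, [0.25, 0.25])  # strong masking
print(tight, loose)
```

In this toy setup, a higher allowed noise energy (stronger masking) lets the search stop at a smaller codebook, which is the mechanism by which a perceptual criterion can reduce the bit rate.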

Similar resources

Improving perceptual coding of narrowband audio signals at low rates

This paper discusses perceptual coding of narrowband audio signals at low rates. In particular, it proposes a new error measure which shapes the noise inside the critical bands, a window switching criterion based on the temporal masking effect of the hearing system, a more accurate model of the simultaneous masking effect of the hearing system, perceptually-based bit allocation algorithms based...

Perceptual Coding of Narrowband Audio Signals at 8 kbit/s

This paper proposes a VQ-based transform coding scheme for audio signals (sampled at 8 kHz) at very low bit rates. This coder uses a new perceptually based distortion measure, which takes into account the energy of audible noise, in both training the codebooks and selecting the best codewords. An adaptive bit allocation strategy based on the distribution of the energy of the transform coefficien...

Fast encoding algorithms for MPEG-4 TwinVQ audio tool

The ISO/IEC MPEG-4 Audio standard includes the TwinVQ encoding tool. This tool is suitable for low-bit-rate general audio coding, but its drawback is the computational complexity of the encoder. To develop a faster TwinVQ encoder, new fast vector quantization algorithms (area localized pre-selection and hit zone masking) are introduced. These algorithms exploit pre- and main-selection procedure sch...

Improved noise-robustness in distributed speech recognition via perceptually-weighted vector quantisation of filterbank energies

In this paper, we examine a coding scheme for quantising feature vectors in a distributed speech recognition environment that is more robust to noise. It consists of a vector quantiser that operates on the logarithmic filterbank energies (LFBEs). Through the use of a perceptually-weighted Euclidean distance measure, which emphasises the LFBEs that represent the spectral peaks, the vector quanti...

Low Complexity, Low Delay and Scalable Audio Coding Scheme Based on a Novel Statistical Perceptual Quantization Procedure

In this paper we present Fast Perceptual Quantization (FPQ), a novel procedure to quantize and code audio signals. It employs the same psychoacoustics principles used in the popular MPEG/Audio coders, but substantially simplifies the complexity and computational needs of the encoding process. FPQ is based on defining a hierarchy of privileged quantization values so that the masking threshold ca...

Journal title:
  • IEEE Trans. Speech and Audio Processing

Volume 10  Issue 

Pages  -

Publication date 2002